Abstract. We propose and justify a new approach to constructing optimal nonlinear transforms of random vectors. We show that the proposed transform improves such characteristics of rank-reduced transforms as compression ratio and accuracy of decompression, and reduces the required computational work. The proposed transform $\mathcal{T}_p$ is presented in the form of a sum with $p$ terms where each term is interpreted as a particular rank-reduced transform. Moreover, terms in $\mathcal{T}_p$ are represented as a combination of three operations $\mathcal{F}_k$, $\mathcal{Q}_k$ and $\varphi_k$ with $k = 1, \ldots, p$. The prime idea is to determine $\mathcal{F}_k$ separately, for each $k = 1, \ldots, p$, from an associated rank-constrained minimization problem similar to that used in the Karhunen-Loève transform. The operations $\mathcal{Q}_k$ and $\varphi_k$ are auxiliary for finding $\mathcal{F}_k$. The contribution of each term in $\mathcal{T}_p$ improves the entire transform performance. A corresponding unconstrained nonlinear optimal transform is also considered. Such a transform is important in its own right because it is treated as an optimal filter without signal compression. A rigorous analysis of errors associated with the proposed transforms is given.

Key words: best approximation; Fourier series in Hilbert space; matrix computation
1 Introduction

Methods of data dimensionality reduction [...] have been applied successfully to many applied problems. The diversity of applications has stimulated a considerable increase in the study of data dimensionality reduction in recent decades. Significant recent results in this challenging research area are described, in particular, in references [...].

The known methods concern both a probabilistic setting (as in [...]) and a deterministic setting (as in [...]) in the dimensionality reduction. The associated techniques are often based on the use of reduced-rank operators.

In this paper, a further advance in the development of reduced-rank transforms is presented. We study a new approach to data dimensionality reduction in a probabilistic setting based on the development of ideas presented in [5, 6, 7, 26, 27, 28, 29].

Motivation for the proposed approach arises from the following observation. In general, the reduced-rank transform consists of three companion operations: filtering, compression and reconstruction [...]. Filtering and compression are performed simultaneously to estimate a reference signal $\mathbf{x}$ with $m$ components from noisy observable data $\mathbf{y}$ and to filter and reduce the data to a shorter vector $\hat{\mathbf{x}}$ with $\eta$ components, $\eta < m$. Components of $\hat{\mathbf{x}}$ are often called principal components [3]. The quotient $\eta/m$ is called the compression ratio. Reconstruction returns a vector $\tilde{\mathbf{x}}$ with $m$ components so that $\tilde{\mathbf{x}}$ should be close to the original $\mathbf{x}$. It is natural to perform these three operations so that the reconstruction error and the related computational burden are minimal.

As a result, the performance of the reduced-rank transform is characterized by three issues: (i) associated accuracy, (ii) compression ratio, and (iii) computational work.

For a given compression ratio, the Karhunen-Loève transform (KLT) [5, 6, 7] minimizes the reconstruction error over the class of all linear reduced-rank transforms. Nevertheless, it may happen that the accuracy and compression ratio associated with the KLT are still not satisfactory. In such a case, an improvement in the accuracy and compression ratio can be achieved by a transform with a more general structure than that of the KLT. Special non-linear transforms have been studied in [...] using transform structures developed from the generalised Volterra polynomials. Nevertheless, the transforms [16, 26, 27, 28, 29] imply a substantial computational burden associated with the large number $N$ of terms required by the underlying Volterra polynomial structure.

Our objective is to justify a new transform that may have both accuracy and compression ratio better than those of the known transforms [5, 6, 7, 26, 27, 28, 29]. A related objective is to find a way to reduce the associated computational work compared with that implied by the transforms [...]. The analysis of these issues is given, in particular, in Section 5.2.2 (Remark 4).

In Section 5.2.5, we show that the proposed approach generalizes the Fourier series in Hilbert space, the Wiener filter, the Karhunen-Loève transform and the transforms given in [26, 27, ...].

2 Method description 

We use the following notation: 

$(\Omega, \Sigma, \mu)$ is a probability space, where $\Omega = \{\omega\}$ is the set of outcomes, $\Sigma$ a $\sigma$-field of measurable subsets of $\Omega$ and $\mu : \Sigma \to [0, 1]$ an associated probability measure on $\Sigma$, with $\mu(\Omega) = 1$; $\mathbf{x} \in L^2(\Omega, \mathbb{R}^m)$ and $\mathbf{y} \in L^2(\Omega, \mathbb{R}^n)$ are random vectors with realizations $x = \mathbf{x}(\omega) \in \mathbb{R}^m$ and $y = \mathbf{y}(\omega) \in \mathbb{R}^n$, respectively.

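Each matrix $M \in \mathbb{R}^{m \times n}$ defines a bounded linear transformation $\mathcal{M} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^m)$ via the formula $[\mathcal{M}\mathbf{y}](\omega) = M\mathbf{y}(\omega)$ for each $\omega \in \Omega$. We note that there are many bounded linear transformations from $L^2(\Omega, \mathbb{R}^n)$ into $L^2(\Omega, \mathbb{R}^m)$ that cannot be written in the form $[\mathcal{M}\mathbf{y}](\omega) = M\mathbf{y}(\omega)$ for each $\omega \in \Omega$. A trivial example is $\mathcal{A} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^m)$ given by

$$\mathcal{A}(\mathbf{y}) = \int_\Omega \mathbf{y}(\omega)\, d\mu(\omega).$$

Throughout the paper, the calligraphic letters denote operators defined similarly to $\mathcal{M}$.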

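Let $\mathbf{g} = [g_1 \ldots g_m]^T \in L^2(\Omega, \mathbb{R}^m)$ and $\mathbf{h} = [h_1 \ldots h_n]^T \in L^2(\Omega, \mathbb{R}^n)$ be random vectors with $g_i, h_k \in L^2(\Omega, \mathbb{R})$ for $i = 1, \ldots, m$, $k = 1, \ldots, n$. For all $i = 1, \ldots, m$ and $k = 1, \ldots, n$, we set

$$E[g_i] = \int_\Omega g_i(\omega)\, d\mu(\omega), \qquad E[g_i h_k] = \int_\Omega g_i(\omega) h_k(\omega)\, d\mu(\omega), \qquad (1)$$

$$E_{gh} = E[\mathbf{g}\mathbf{h}^T] = \{E[g_i h_k]\} \in \mathbb{R}^{m \times n} \quad\text{and}\quad E_g = E[\mathbf{g}] = \{E[g_i]\} \in \mathbb{R}^m. \qquad (2)$$

We also write

$$\mathbb{E}_{gh} = E[(\mathbf{g} - E_g)(\mathbf{h} - E_h)^T] = E_{gh} - E[\mathbf{g}]E[\mathbf{h}^T].$$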
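In practice the expectations above are replaced by sample averages. The following is a minimal sketch, assuming realizations are stored as rows of NumPy arrays; the function names `second_moment` and `covariance` are illustrative and not part of the paper.

```python
import numpy as np

def second_moment(g, h):
    """Sample estimate of E_gh = E[g h^T]; each row is one realization."""
    return g.T @ h / g.shape[0]

def covariance(g, h):
    """Sample estimate of E[(g - E_g)(h - E_h)^T] = E_gh - E[g] E[h^T]."""
    return second_moment(g, h) - np.outer(g.mean(axis=0), h.mean(axis=0))

rng = np.random.default_rng(0)
g = rng.normal(size=(1000, 3)) + 1.0   # m = 3, nonzero mean
h = rng.normal(size=(1000, 2)) - 2.0   # n = 2, nonzero mean

# The same quantity computed directly from centered samples.
direct = (g - g.mean(axis=0)).T @ (h - h.mean(axis=0)) / g.shape[0]
```

Both routes agree up to rounding, which mirrors the identity between the centered and uncentered forms of the covariance matrix.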

Achievement of the above objectives is based on the presentation of the proposed transform in the form of a sum with $p$ terms (3) where each term is interpreted as a particular rank-reduced transform. Moreover, terms in (3) are represented as a combination of three operations $\mathcal{F}_k$, $\mathcal{Q}_k$ and $\varphi_k$ for each $k = 1, \ldots, p$, where $\varphi_k$ is nonlinear. The prime idea is to determine $\mathcal{F}_k$ separately, for each $k = 1, \ldots, p$, from an associated rank-constrained minimization problem similar to that in the KLT. The operations $\mathcal{Q}_k$ and $\varphi_k$ are auxiliary for finding $\mathcal{F}_k$. It is natural to expect that a contribution of each term in (3) will improve the entire transform performance.

To realize such a scheme, we choose the $\mathcal{Q}_k$ as orthogonal/orthonormal operators (see Section 4). Then each $\mathcal{F}_k$ can be determined independently for each individual problem (33) or (56) below. Next, operators $\varphi_k$ are used to reduce the number of terms from $N$ (as in [16, 26, 27, 28, 29]) to $p$ with $p \ll N$. For example, this can be done when we choose $\varphi_k$ in the form presented in Section 5.2.4. Moreover, the composition of operators $\mathcal{Q}_k$ and $\varphi_k$ allows us to reduce the related covariance matrices to the identity matrix or to a block-diagonal form with small blocks. Remark 4 in Section 5.2.2 gives more details in this regard. The computational work associated with such blocks is much less than that for the large covariance matrices in [...].

To regulate accuracy associated with the proposed transform and its compression ratio, we formulate the problem in the form (6)-(7) where (7) consists of $p$ constraints. It is shown in Remark 2 of Section 4 and in Sections 5.2.1, 5.2.2 and 5.2.4 that such a combination of constraints allows us to equip the proposed transforms with several degrees of freedom.

The structure of our transform is presented in Section 3 and the formal statement of the problem in Section 4. In Section 5, we determine operators $\mathcal{Q}_k$ and $\mathcal{F}_k$ (Lemmata 1 and 3 and Theorems 1 and 2, respectively).

3 Structure of the proposed transform 

3.1 Generic form 

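The proposed transform $\mathcal{T}_p$ is presented in the form

$$\mathcal{T}_p(\mathbf{y}) = f + \sum_{k=1}^{p} \mathcal{F}_k \mathcal{Q}_k \varphi_k(\mathbf{y}) = f + \mathcal{F}_1\mathcal{Q}_1\varphi_1(\mathbf{y}) + \cdots + \mathcal{F}_p\mathcal{Q}_p\varphi_p(\mathbf{y}), \qquad (3)$$

where $f \in \mathbb{R}^m$, $\varphi_k : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^n)$, $\mathcal{Q}_1, \ldots, \mathcal{Q}_p : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^n)$ and $\mathcal{F}_k : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^m)$.

In general, one can put $\mathbf{x} \in L^2(\Omega, H_X)$, $\mathbf{y} \in L^2(\Omega, H_Y)$, $\varphi_k : L^2(\Omega, H_Y) \to L^2(\Omega, H_k)$, $\mathcal{Q}_k : L^2(\Omega, H_k) \to L^2(\Omega, \tilde{H}_k)$ and $\mathcal{F}_k : L^2(\Omega, \tilde{H}_k) \to L^2(\Omega, H_X)$ with $H_X$, $H_Y$, $H_k$ and $\tilde{H}_k$ separable Hilbert spaces, and $k = 1, \ldots, p$.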
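The structure of the sum above can be sketched for a single realization in the finite-dimensional case. This is only an illustration under the assumption that each $\mathcal{F}_k$ and $\mathcal{Q}_k$ acts as a fixed matrix on a realization and each $\varphi_k$ is an ordinary function; the name `apply_transform` and the toy matrices are hypothetical.

```python
import numpy as np

def apply_transform(y, f, F, Q, phi):
    """Evaluate f + sum_k F[k] @ Q[k] @ phi[k](y) for one realization y."""
    out = np.array(f, dtype=float)
    for F_k, Q_k, phi_k in zip(F, Q, phi):
        out = out + F_k @ (Q_k @ phi_k(y))
    return out

m, n = 2, 3
f = np.zeros(m)
F = [np.ones((m, n)), 0.5 * np.ones((m, n))]   # toy m x n matrices
Q = [np.eye(n), np.eye(n)]                     # identity in place of Q_k
phi = [lambda y: y, lambda y: y ** 2]          # toy nonlinear maps
x_est = apply_transform(np.array([1.0, 2.0, 3.0]), f, F, Q, phi)
```

With these toy inputs the first term contributes 6 to each component and the second contributes 7, so every component of `x_est` equals 13.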

In (3), the vector $f$ and operators $\mathcal{F}_1, \ldots, \mathcal{F}_p$ are determined from the minimization problem (6)-(7) given in Section 4. Operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ in (3) are orthogonal (orthonormal) in the sense of Definition 1 in Section 4 (in this regard, see also Remark 3 in Section 5.1).

To demonstrate and justify flexibility of the transform $\mathcal{T}_p$ with respect to the choice of $\varphi_1, \ldots, \varphi_p$ in (3), we mainly study the case where $\varphi_1, \ldots, \varphi_p$ are arbitrary. Specifications of $\varphi_1, \ldots, \varphi_p$ are presented in Sections 3.2, 5.2.4 and 5.2.5 where we also discuss the benefits associated with some particular forms of $\varphi_1, \ldots, \varphi_p$.

3.2 Some particular cases 

Particular cases of the model $\mathcal{T}_p$ are associated with specific choices of $\varphi_k$, $\mathcal{Q}_k$ and $\mathcal{F}_k$. Some examples are given below.

(i) If $H_X = H_Y = \mathbb{R}^n$ and $H_k = \tilde{H}_k = \mathbb{R}^{nk}$, where $\mathbb{R}^{nk}$ is the $k$th degree of $\mathbb{R}^n$, then (3) generalises the known transform structures [...]. The models [...] follow from (3) if $\varphi_k(\mathbf{y}) = \mathbf{y}^k$ where $\mathbf{y}^k = (\mathbf{y}, \ldots, \mathbf{y}) \in L^2(\Omega, \mathbb{R}^{nk})$, $\mathcal{Q}_k = \mathcal{I}$, where $\mathcal{I}$ is the identity operator, and if $\mathcal{F}_k$ is a $k$-linear operator. It has been shown in [...] that such a form of $\varphi_k$ leads to a significant improvement in the associated accuracy. See Section 5.2.5 for more details.

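(ii) If $\varphi_k : L^2(\Omega, H_Y) \to L^2(\Omega, H_X)$ and $\{\mathbf{u}_1, \mathbf{u}_2, \ldots\}$ is a basis in $L^2(\Omega, H_X)$ then $\varphi_k$ and $\mathcal{Q}_k$ can be chosen so that $\varphi_k(\mathbf{y}) = \mathbf{u}_k$ and $\mathcal{Q}_k = \mathcal{I}$, respectively. As a result, in this particular case,

$$\mathcal{T}_p(\mathbf{y}) = f + \sum_{k=1}^{p} \mathcal{F}_k(\mathbf{u}_k).$$

(iii) A similar case follows if $\varphi_k : L^2(\Omega, H_Y) \to L^2(\Omega, H_k)$ is arbitrary but $\mathcal{Q}_k : L^2(\Omega, H_k) \to L^2(\Omega, \tilde{H}_k)$ is defined so that $\mathcal{Q}_k[\varphi_k(\mathbf{y})] = \mathbf{v}_k$ with $k = 1, \ldots, p$ where $\{\mathbf{v}_1, \mathbf{v}_2, \ldots\}$ is a basis in $L^2(\Omega, \tilde{H}_k)$. Then

$$\mathcal{T}_p(\mathbf{y}) = f + \sum_{k=1}^{p} \mathcal{F}_k(\mathbf{v}_k).$$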

(iv) Let $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(p)}$ be estimates of $\mathbf{x}$ by the known transforms [7, 25, ...]. Then we can put $\varphi_1(\mathbf{y}) = \mathbf{x}^{(1)}, \ldots, \varphi_p(\mathbf{y}) = \mathbf{x}^{(p)}$. In particular, one could choose $\varphi_1(\mathbf{y}) = \mathbf{y}$. In such a way, the vector $\mathbf{x}$ is pre-estimated from $\mathbf{y}$, and therefore, the overall estimate of $\mathbf{x}$ by $\mathcal{T}_p$ will be improved. A new recursive method for finding $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(p)}$ is given in Section 5.2.4 below.

Other particular cases of the proposed transform are considered in Sections 5.2.4 and 5.2.5.

Remark 1. The particular case of $\mathcal{T}_p$ considered in item (iii) above can be interpreted as an operator form of the Fourier polynomial in Hilbert space [35]. The benefits associated with the Fourier polynomials are well known. In item (ii) of Section 5.2.5, this case is considered in more detail.

4 Statement of the problem 

First, we define orthogonal and orthonormal operators as follows. 

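Definition 1. Let $\mathbf{u}_k \in L^2(\Omega, \mathbb{R}^n)$ and $\mathbf{v}_k = \mathcal{Q}_k(\mathbf{u}_k)$. The operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ are called pairwise orthonormal if

$$\mathbb{E}_{v_i v_j} = \begin{cases} \mathbb{O}, & i \neq j, \\ I, & i = j \end{cases}$$

for any $i, j = 1, \ldots, p$. Here, $\mathbb{O}$ and $I$ are the zero matrix and identity matrix, respectively. If $\mathbb{E}_{v_i v_j} = \mathbb{O}$ for $i \neq j$ with $i, j = 1, \ldots, p$, and if $\mathbb{E}_{v_i v_j}$ is not necessarily equal to $I$ for $i = j$, then $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ are called pairwise orthogonal.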

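Hereinafter, we suppose that $\mathcal{F}_k$ is linear for all $k = 1, \ldots, p$ and that the Hilbert spaces are the finite-dimensional Euclidean spaces, $H_X = \mathbb{R}^m$ and $H_Y = H_k = \tilde{H}_k = \mathbb{R}^n$. For any vector $\mathbf{g} \in L^2(\Omega, \mathbb{R}^m)$, we set

$$E[\|\mathbf{g}\|^2] = \int_\Omega \|\mathbf{g}(\omega)\|^2\, d\mu(\omega) < \infty, \qquad (4)$$

where $\|\mathbf{g}(\omega)\|$ is the Euclidean norm of $\mathbf{g}(\omega)$.

Let us denote

$$J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) = E[\|\mathbf{x} - \mathcal{T}_p(\mathbf{y})\|^2]. \qquad (5)$$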

The problem is 

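(i) to find operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ satisfying Definition 1, and

(ii) to determine the vector $f^0$ and operators $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ such that

$$J(f^0, \mathcal{F}_1^0, \ldots, \mathcal{F}_p^0) = \min_{f, \mathcal{F}_1, \ldots, \mathcal{F}_p} J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) \qquad (6)$$

subject to

$$\operatorname{rank} \mathcal{F}_1 = \eta_1, \quad \ldots, \quad \operatorname{rank} \mathcal{F}_p = \eta_p, \qquad (7)$$

where $\eta_1 + \cdots + \eta_p = \eta < \min\{m, n\}$.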
Here, for $k = 1, \ldots, p$ (see, for example, [...]),

$$\operatorname{rank}(\mathcal{F}_k) = \dim \mathcal{F}_k(L^2(\Omega, \mathbb{R}^n)).$$

We write

$$\mathcal{T}_p^0(\mathbf{y}) = f^0 + \sum_{k=1}^{p} \mathcal{F}_k^0(\mathbf{v}_k) \qquad (8)$$

with $\mathbf{v}_k$ defined by Definition 1.

It is supposed that covariance matrices formed from vectors $\mathcal{Q}_1\varphi_1(\mathbf{y}), \ldots, \mathcal{Q}_p\varphi_p(\mathbf{y})$ in (3) are known or can be estimated. Various estimation methods can be found in [...]. We note that such an assumption is traditional [...] in the study of optimal transforms. The effective estimation of covariance matrices represents a specific task [...] which is not considered in this paper.

Remark 2. Unlike known rank-constrained problems, we consider $p$ constraints (7). The number $p$ of the constraints and the ranks $\eta_1, \ldots, \eta_p$ form the degrees of freedom for $\mathcal{T}_p^0$. Variation of $p$ and $\eta_1, \ldots, \eta_p$ allows us to regulate accuracy associated with the transform $\mathcal{T}_p^0$ (see (21) in Section 5.2.1 and (46) in Section 5.2.2) and its compression ratio (see (67) in Section 5.2.3). It follows from (21) and (46) that the accuracy increases if $p$ and $\eta_1, \ldots, \eta_p$ increase. Conversely, by (67), the compression ratio is improved if $\eta_1, \ldots, \eta_p$ decrease.

5 Solution of the problem 

The problem (6)-(7) generalises the known rank-constrained problems where only one constraint has been considered. Our plan for the solution is as follows. First, in Section 5.1, we will determine the operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$. Then, in Section 5.2, we will obtain $f^0$ and $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ satisfying (6) and (7).

5.1 Determination of orthogonalizing operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$

If $M$ is a square matrix then we write $M^{1/2}$ for a matrix such that $M^{1/2}M^{1/2} = M$. We note that the matrix $M^{1/2}$ can be computed in various ways [12]. In this paper, $M^{1/2}$ is determined from the singular value decomposition (SVD) [33] of $M$.
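For a symmetric positive semi-definite matrix, such as a covariance matrix, one way to obtain a square root from the SVD can be sketched as follows. This is an illustration only: the helper name `sqrt_psd` is hypothetical, and the restriction to symmetric positive semi-definite input is an assumption made here.

```python
import numpy as np

def sqrt_psd(M):
    """Square root of a symmetric positive semi-definite matrix via its SVD."""
    U, s, _ = np.linalg.svd(M)
    # For such M, left singular vectors with nonzero singular values are eigenvectors.
    return U @ np.diag(np.sqrt(s)) @ U.T

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 2))
M = A @ A.T                      # 4 x 4, rank 2, so M is singular
R = sqrt_psd(M)
```

The example uses a singular matrix on purpose, since singular covariance matrices are the case treated with pseudo-inverses later in the section.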

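For the case when matrix $\mathbb{E}_{v_k v_k}$ is invertible for any $k = 1, \ldots, p$, the orthonormalization procedure is as follows. For $\mathbf{u}_k \in L^2(\Omega, \mathbb{R}^n)$, we write

$$[\mathcal{Q}_k(\mathbf{u}_k)](\omega) = Q_k \mathbf{u}_k(\omega), \qquad (9)$$

where $Q_k \in \mathbb{R}^{n \times n}$. For $\mathbf{u}_k, \mathbf{v}_j, \mathbf{w}_j \in L^2(\Omega, \mathbb{R}^n)$, we also define operators $\mathcal{E}_{u_k v_j}, \mathcal{E}_{v_j v_j}^{-1} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^n)$ by the equations

$$[\mathcal{E}_{u_k v_j}(\mathbf{w}_j)](\omega) = \mathbb{E}_{u_k v_j} \mathbf{w}_j(\omega) \quad\text{and}\quad [\mathcal{E}_{v_j v_j}^{-1}(\mathbf{w}_j)](\omega) = \mathbb{E}_{v_j v_j}^{-1} \mathbf{w}_j(\omega), \qquad (10)$$

respectively.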
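Lemma 1. Let

$$\mathbf{w}_1 = \mathbf{u}_1 \quad\text{and}\quad \mathbf{w}_i = \mathbf{u}_i - \sum_{k=1}^{i-1} \mathcal{E}_{u_i w_k} \mathcal{E}_{w_k w_k}^{-1}(\mathbf{w}_k) \quad\text{for } i = 1, \ldots, p, \qquad (11)$$

where $\mathcal{E}_{w_k w_k}^{-1}$ exists. Then

(i) the vectors $\mathbf{w}_1, \ldots, \mathbf{w}_p$ are pairwise orthogonal, and

(ii) the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p$, defined by

$$\mathbf{v}_i = \mathcal{Q}_i(\mathbf{u}_i) \qquad (12)$$

with

$$\mathcal{Q}_i(\mathbf{u}_i) = (\mathcal{E}_{w_i w_i}^{1/2})^{-1}(\mathbf{w}_i) \qquad (13)$$

for $i = 1, \ldots, p$, are pairwise orthonormal.

Proof. The proof is given in the Appendix. ■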
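The procedure of Lemma 1 can be sketched numerically with sample covariances in place of exact ones. The sketch below assumes realizations are stored as rows and that every sample covariance of $\mathbf{w}_k$ is invertible; the helper names and the choice $\mathbf{u}_k = \mathbf{y}^k$ (componentwise powers) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cov(a, b):
    """Sample covariance E[(a - E a)(b - E b)^T]; rows are realizations."""
    ac, bc = a - a.mean(axis=0), b - b.mean(axis=0)
    return ac.T @ bc / a.shape[0]

def inv_sqrt(M):
    """Inverse square root of a symmetric positive definite matrix (via SVD)."""
    U, s, _ = np.linalg.svd(M)
    return U @ np.diag(1.0 / np.sqrt(s)) @ U.T

def orthonormalize(us):
    """Samples of v_1..v_p built from samples of u_1..u_p as in (11)-(13)."""
    ws, vs = [], []
    for u in us:
        w = u.copy()
        for w_k in ws:
            Z = cov(u, w_k) @ np.linalg.inv(cov(w_k, w_k))
            w = w - w_k @ Z.T            # subtract E_{u w_k} E_{w_k w_k}^{-1} w_k
        ws.append(w)
        vs.append(w @ inv_sqrt(cov(w, w)))   # the matrix is symmetric
    return vs

rng = np.random.default_rng(1)
y = rng.normal(size=(500, 2))
us = [y, y ** 2, y ** 3]          # hypothetical u_k = phi_k(y)
vs = orthonormalize(us)
```

With sample covariances, the resulting vectors have identity auto-covariance and zero cross-covariance up to rounding, which is the sample analogue of pairwise orthonormality.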

For the case when matrix $\mathbb{E}_{v_k v_k}$ is singular for $k = 1, \ldots, p$, the orthogonalizing operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ are determined by Lemma 3 below. Another difference from Lemma 1 is that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p$ in Lemma 3 are pairwise orthogonal but not orthonormal. An intermediate result is given in Lemma 2.

The symbol $\dagger$ is used to denote the pseudo-inverse operator [...]. It is supposed that the pseudo-inverse $M^\dagger$ for matrix $M$ is determined from the SVD of $M$.

Lemma 2 ([26]). For any random vectors $\mathbf{g} \in L^2(\Omega, \mathbb{R}^m)$ and $\mathbf{h} \in L^2(\Omega, \mathbb{R}^n)$,

$$\mathbb{E}_{gh} \mathbb{E}_{hh}^\dagger \mathbb{E}_{hh} = \mathbb{E}_{gh}. \qquad (14)$$
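The identity of Lemma 2 can be checked numerically on sample covariances, including the case of a singular covariance matrix of $\mathbf{h}$. The sketch below is an illustration with synthetic data; the explicit cutoff `rcond=1e-10` is an assumption made here to keep the numerical rank decision stable.

```python
import numpy as np

def cov(a, b):
    """Sample covariance E[(a - E a)(b - E b)^T]; rows are realizations."""
    ac, bc = a - a.mean(axis=0), b - b.mean(axis=0)
    return ac.T @ bc / a.shape[0]

rng = np.random.default_rng(2)
a = rng.normal(size=(400, 2))
h = np.column_stack([a, a[:, 0] + a[:, 1]])       # third component is redundant
g = np.column_stack([a[:, 0] ** 2, a[:, 1] + rng.normal(size=400)])

E_gh, E_hh = cov(g, h), cov(h, h)
lhs = E_gh @ np.linalg.pinv(E_hh, rcond=1e-10) @ E_hh
```

Here the covariance of `h` has rank 2 although it is a 3 x 3 matrix, and `lhs` still reproduces `E_gh`.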

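Lemma 3. Let $\mathbf{v}_i = \mathcal{Q}_i(\mathbf{u}_i)$ for $i = 1, \ldots, p$, where $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ are such that

$$\mathcal{Q}_1(\mathbf{u}_1) = \mathbf{u}_1 \quad\text{and}\quad \mathcal{Q}_i(\mathbf{u}_i) = \mathbf{u}_i - \sum_{k=1}^{i-1} \mathcal{Z}_{ik}(\mathbf{v}_k) \quad\text{for } i = 2, \ldots, p \qquad (15)$$

with $\mathcal{Z}_{ik} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^n)$ defined by

$$Z_{ik} = \mathbb{E}_{u_i v_k} \mathbb{E}_{v_k v_k}^\dagger + A_{ik}(I - \mathbb{E}_{v_k v_k} \mathbb{E}_{v_k v_k}^\dagger) \qquad (16)$$

with $A_{ik} \in \mathbb{R}^{n \times n}$ arbitrary. Then the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p$ are pairwise orthogonal.

Proof. The proof is given in the Appendix. ■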
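A numerical sketch of the orthogonalization in Lemma 3 follows, using sample covariances and the particular choice $A_{ik} = \mathbb{O}$. The function name `orthogonalize`, the synthetic data and the cutoff `rcond` are assumptions made for this illustration.

```python
import numpy as np

def cov(a, b):
    """Sample covariance E[(a - E a)(b - E b)^T]; rows are realizations."""
    ac, bc = a - a.mean(axis=0), b - b.mean(axis=0)
    return ac.T @ bc / a.shape[0]

def orthogonalize(us, rcond=1e-10):
    """Samples of v_1..v_p from (15)-(16) with every A_ik set to zero."""
    vs = []
    for u in us:
        v = u.copy()
        for v_k in vs:
            Z = cov(u, v_k) @ np.linalg.pinv(cov(v_k, v_k), rcond=rcond)
            v = v - v_k @ Z.T
        vs.append(v)
    return vs

rng = np.random.default_rng(4)
a = rng.normal(size=(600, 2))
u1 = np.column_stack([a, a[:, 0] + a[:, 1]])   # covariance of u1 is singular
us = [u1, u1 ** 2, np.cos(u1)]                 # hypothetical u_k
vs = orthogonalize(us)
```

The cross-covariances of the resulting vectors vanish up to rounding even though the covariance of the first vector is not invertible.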

We note that Lemma 3 does not require invertibility of matrix $\mathbb{E}_{v_k v_k}$. At the same time, if $\mathbb{E}_{v_k v_k}^{-1}$ exists, then vectors $\mathbf{w}_1, \ldots, \mathbf{w}_p$ and $\mathbf{v}_1, \ldots, \mathbf{v}_p$ defined by (11) and Lemma 3, respectively, coincide.

Remark 3. Orthogonalization of random vectors is not, of course, a new idea. In particular, generalizations of the Gram-Schmidt orthogonalization procedure have been considered in [46, 47]. The proposed orthogonalization procedures in Lemmata 1 and 3 are different from those in [46, 47]. In particular, Lemma 3 establishes the vector orthogonalization in terms of pseudo-inverse operators. A particular case of the practical implementation of the random vector orthogonalization is considered in a later section.
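5.2 Determination of $f^0, \mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ satisfying (6)-(7)

5.2.1 The case when matrix $\mathbb{E}_{v_i v_i}$ is invertible for $i = 1, \ldots, p$

We consider the simpler case when $\mathbb{E}_{v_i v_i}$ is invertible for all $i = 1, \ldots, p$. Then the vector $f^0$ and operators $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ satisfying (6)-(7) are defined from the following Theorem 1. For each $i = 1, \ldots, p$, let $U_i \Sigma_i V_i^T$ be the SVD of $\mathbb{E}_{x v_i}$,

$$U_i \Sigma_i V_i^T = \mathbb{E}_{x v_i}, \qquad (17)$$

where $U_i \in \mathbb{R}^{m \times n}$, $V_i \in \mathbb{R}^{n \times n}$ are orthogonal and $\Sigma_i \in \mathbb{R}^{n \times n}$ is diagonal,

$$U_i = [s_{i1}, \ldots, s_{in}], \quad V_i = [d_{i1}, \ldots, d_{in}] \quad\text{and}\quad \Sigma_i = \operatorname{diag}(\alpha_{i1}, \ldots, \alpha_{in}) \qquad (18)$$

with $\alpha_{i1} \ge \cdots \ge \alpha_{ir} > 0$, $\alpha_{i,r+1} = \cdots = \alpha_{in} = 0$ and $r = 1, \ldots, n$ where $r = r(i)$. We set

$$U_{i\eta_i} = [s_{i1}, \ldots, s_{i\eta_i}], \quad V_{i\eta_i} = [d_{i1}, \ldots, d_{i\eta_i}] \quad\text{and}\quad \Sigma_{i\eta_i} = \operatorname{diag}(\alpha_{i1}, \ldots, \alpha_{i\eta_i}),$$

where $U_{i\eta_i} \in \mathbb{R}^{m \times \eta_i}$, $V_{i\eta_i} \in \mathbb{R}^{n \times \eta_i}$ and $\Sigma_{i\eta_i} \in \mathbb{R}^{\eta_i \times \eta_i}$. Now we define $K_{i\eta_i} \in \mathbb{R}^{m \times n}$ and $\mathcal{K}_{i\eta_i} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^m)$ by

$$K_{i\eta_i} = U_{i\eta_i} \Sigma_{i\eta_i} V_{i\eta_i}^T \quad\text{and}\quad [\mathcal{K}_{i\eta_i}(\mathbf{w}_i)](\omega) = K_{i\eta_i}[\mathbf{w}_i(\omega)], \qquad (19)$$

respectively, for any $\mathbf{w}_i \in L^2(\Omega, \mathbb{R}^n)$.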
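The truncation in (19) is a standard truncated SVD. A minimal sketch is given below, with a random matrix standing in for the covariance matrix; the function name `truncate` is hypothetical.

```python
import numpy as np

def truncate(M, eta):
    """Return K = U_eta Sigma_eta V_eta^T and all singular values of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    K = U[:, :eta] @ np.diag(s[:eta]) @ Vt[:eta, :]
    return K, s

rng = np.random.default_rng(5)
M = rng.normal(size=(5, 3))      # stands in for the m x n covariance matrix
K, s = truncate(M, eta=2)
residual = np.linalg.norm(K - M, "fro") ** 2
```

The squared Frobenius distance between the matrix and its rank-2 truncation equals the sum of the squared discarded singular values, here just the third one.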

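Theorem 1. Let $\mathbf{v}_1, \ldots, \mathbf{v}_p$ be determined by Lemma 1. Then the vector $f^0$ and operators $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$, satisfying (6)-(7), are determined by

$$f^0 = E[\mathbf{x}] - \sum_{k=1}^{p} F_k^0 E[\mathbf{v}_k] \quad\text{and}\quad \mathcal{F}_1^0 = \mathcal{K}_{1\eta_1}, \ \ldots, \ \mathcal{F}_p^0 = \mathcal{K}_{p\eta_p}. \qquad (20)$$

The accuracy associated with transform $\mathcal{T}_p^0$, determined by (8) and (20), is given by

$$E[\|\mathbf{x} - \mathcal{T}_p^0(\mathbf{y})\|^2] = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \sum_{j=1}^{\eta_k} \alpha_{kj}^2. \qquad (21)$$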

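Proof. The functional $J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p)$ is written as

$$\begin{aligned} J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) = \operatorname{tr}\Big\{ & E_{xx} - E[\mathbf{x}]f^T - \sum_{i=1}^{p} E_{xv_i}F_i^T - fE[\mathbf{x}^T] + ff^T + f\sum_{i=1}^{p} E[\mathbf{v}_i^T]F_i^T \\ & - \sum_{i=1}^{p} F_i E_{v_i x} + \sum_{i=1}^{p} F_i E[\mathbf{v}_i]f^T + E\Big[\sum_{i=1}^{p} \mathcal{F}_i(\mathbf{v}_i)\Big(\sum_{k=1}^{p} \mathcal{F}_k(\mathbf{v}_k)\Big)^T\Big] \Big\}. \end{aligned} \qquad (22)$$

We recall (see Section 2) that here and below, $F_i$ is defined by $[\mathcal{F}_i(\mathbf{v}_i)](\omega) = F_i[\mathbf{v}_i(\omega)]$ so that, for example, $E[\mathcal{F}_k(\mathbf{v}_k)\mathbf{x}^T] = F_k E_{v_k x}$. In other words, the right-hand side in (22) is a function of $f, F_1, \ldots, F_p$.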

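Let us show that $J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p)$ can be represented as

$$J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) = J_0 + J_1 + J_2, \qquad (23)$$

where

$$J_0 = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{i=1}^{p} \|\mathbb{E}_{xv_i}\|^2, \qquad (24)$$

$$J_1 = \Big\|f - E[\mathbf{x}] + \sum_{i=1}^{p} F_i E[\mathbf{v}_i]\Big\|^2 \quad\text{and}\quad J_2 = \sum_{i=1}^{p} \|F_i - \mathbb{E}_{xv_i}\|^2. \qquad (25)$$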

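Indeed, $J_1$ and $J_2$ are rewritten as follows:

$$\begin{aligned} J_1 = \operatorname{tr}\Big( & ff^T - fE[\mathbf{x}^T] + \sum_{i=1}^{p} fE[\mathbf{v}_i^T]F_i^T + E[\mathbf{x}]E[\mathbf{x}^T] - E[\mathbf{x}]f^T - \sum_{i=1}^{p} E[\mathbf{x}]E[\mathbf{v}_i^T]F_i^T \\ & + \sum_{i=1}^{p} F_iE[\mathbf{v}_i]f^T - \sum_{i=1}^{p} F_iE[\mathbf{v}_i]E[\mathbf{x}^T] + \sum_{i=1}^{p} F_iE[\mathbf{v}_i] \sum_{k=1}^{p} E[\mathbf{v}_k^T]F_k^T \Big) \end{aligned} \qquad (26)$$

and

$$J_2 = \sum_{i=1}^{p} \operatorname{tr}\,(F_i - \mathbb{E}_{xv_i})(F_i^T - \mathbb{E}_{v_i x}) = \sum_{i=1}^{p} \operatorname{tr}\,(F_iF_i^T - F_i\mathbb{E}_{v_i x} - \mathbb{E}_{xv_i}F_i^T + \mathbb{E}_{xv_i}\mathbb{E}_{v_i x}). \qquad (27)$$

In (27), $\sum_{i=1}^{p} \operatorname{tr}(F_iF_i^T)$ can be represented in the form

$$\sum_{i=1}^{p} \operatorname{tr}(F_iF_i^T) = \operatorname{tr}\Big( E\Big[\sum_{i=1}^{p} F_i\mathbf{v}_i \sum_{k=1}^{p} \mathbf{v}_k^T F_k^T\Big] - \sum_{i=1}^{p} F_iE[\mathbf{v}_i] \sum_{k=1}^{p} E[\mathbf{v}_k^T]F_k^T \Big) \qquad (28)$$

because

$$E[\mathbf{v}_i\mathbf{v}_k^T] - E[\mathbf{v}_i]E[\mathbf{v}_k^T] = \begin{cases} \mathbb{O}, & i \neq k, \\ I, & i = k \end{cases} \qquad (29)$$

due to the orthonormality of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p$.

Then

$$\begin{aligned} J_0 + J_1 + J_2 = {} & \operatorname{tr}(E_{xx} - E[\mathbf{x}]E[\mathbf{x}^T]) - \sum_{i=1}^{p} \operatorname{tr}(\mathbb{E}_{xv_i}\mathbb{E}_{v_i x}) \\ & + \operatorname{tr}\Big( ff^T - fE[\mathbf{x}^T] + \sum_{i=1}^{p} fE[\mathbf{v}_i^T]F_i^T + E[\mathbf{x}]E[\mathbf{x}^T] - E[\mathbf{x}]f^T \\ & \qquad - \sum_{i=1}^{p} E[\mathbf{x}]E[\mathbf{v}_i^T]F_i^T + \sum_{i=1}^{p} F_iE[\mathbf{v}_i]f^T - \sum_{i=1}^{p} F_iE[\mathbf{v}_i]E[\mathbf{x}^T] + \sum_{i=1}^{p} F_iE[\mathbf{v}_i] \sum_{k=1}^{p} E[\mathbf{v}_k^T]F_k^T \Big) \\ & + \operatorname{tr}\Big( E\Big[\sum_{i=1}^{p} F_i\mathbf{v}_i \sum_{k=1}^{p} \mathbf{v}_k^T F_k^T\Big] - \sum_{i=1}^{p} F_iE[\mathbf{v}_i] \sum_{k=1}^{p} E[\mathbf{v}_k^T]F_k^T \\ & \qquad - \sum_{i=1}^{p} \big( F_iE_{v_i x} - F_iE[\mathbf{v}_i]E[\mathbf{x}^T] + E_{xv_i}F_i^T - E[\mathbf{x}]E[\mathbf{v}_i^T]F_i^T - \mathbb{E}_{xv_i}\mathbb{E}_{v_i x} \big) \Big) \\ = {} & J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p). \end{aligned} \qquad (30)$$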
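Hence, (23) is true. Therefore,

$$\begin{aligned} J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) = {} & \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \|\mathbb{E}_{xv_k}\|^2 + \Big\|f - E[\mathbf{x}] + \sum_{k=1}^{p} F_kE[\mathbf{v}_k]\Big\|^2 && (31) \\ & + \sum_{k=1}^{p} \|F_k - \mathbb{E}_{xv_k}\|^2. && (32) \end{aligned}$$

It follows from (32) that the constrained minimum (6)-(7) is achieved if $f = f^0$ with $f^0$ given by (20), and if $F_k^0$ is such that

$$J_k(F_k^0) = \min_{F_k} J_k(F_k) \quad\text{subject to}\quad \operatorname{rank}(F_k) = \eta_k, \qquad (33)$$

where $J_k(F_k) = \|F_k - \mathbb{E}_{xv_k}\|^2$. The solution to (33) is given [...] by

$$F_k^0 = K_{k\eta_k}. \qquad (34)$$

Then

$$E[\|\mathbf{x} - \mathcal{T}_p^0(\mathbf{y})\|^2] = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \big( \|\mathbb{E}_{xv_k}\|^2 - \|K_{k\eta_k} - \mathbb{E}_{xv_k}\|^2 \big).$$

Here [...],

$$\|\mathbb{E}_{xv_k}\|^2 = \sum_{j=1}^{r} \alpha_{kj}^2 \quad\text{and}\quad \|K_{k\eta_k} - \mathbb{E}_{xv_k}\|^2 = \sum_{j=\eta_k+1}^{r} \alpha_{kj}^2 \qquad (35)$$

with $r = r(k)$. Thus, (21) is true. The theorem is proved. ■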
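The error formula (21) can be checked numerically when all expectations are replaced by sample averages, since the derivation uses only properties of expectation that sample averages share. The sketch below takes $p = 2$ with the hypothetical choice $\mathbf{u}_1 = \mathbf{y}$, $\mathbf{u}_2 = \mathbf{y}^2$ (componentwise), synthetic data, and ranks $\eta_1 = 2$, $\eta_2 = 1$; all names are illustrative.

```python
import numpy as np

def cov(a, b):
    """Sample covariance E[(a - E a)(b - E b)^T]; rows are realizations."""
    ac, bc = a - a.mean(axis=0), b - b.mean(axis=0)
    return ac.T @ bc / a.shape[0]

def whiten(w):
    """Map w to v with identity sample covariance, as in (13)."""
    U, s, _ = np.linalg.svd(cov(w, w))
    return w @ (U @ np.diag(1.0 / np.sqrt(s)) @ U.T)

rng = np.random.default_rng(3)
N, m, n = 2000, 4, 3
y = rng.normal(size=(N, n))
x = np.tanh(y @ rng.normal(size=(n, m))) + 0.1 * rng.normal(size=(N, m))

# Lemma 1 with p = 2 and the hypothetical choice u_1 = y, u_2 = y**2.
w1 = y
w2 = y ** 2 - w1 @ (cov(y ** 2, w1) @ np.linalg.inv(cov(w1, w1))).T
vs, etas = [whiten(w1), whiten(w2)], [2, 1]

x_hat = np.tile(x.mean(axis=0), (N, 1))      # starts from E[x]
predicted = np.trace(cov(x, x))              # ||E_xx^{1/2}||^2
for v, eta in zip(vs, etas):
    U, s, Vt = np.linalg.svd(cov(x, v), full_matrices=False)
    F = U[:, :eta] @ np.diag(s[:eta]) @ Vt[:eta, :]   # rank-eta matrix as in (19)
    x_hat += (v - v.mean(axis=0)) @ F.T               # f^0 absorbs E[v_k]
    predicted -= np.sum(s[:eta] ** 2)                 # right-hand side of (21)
empirical = np.mean(np.sum((x - x_hat) ** 2, axis=1))
```

The empirical mean squared error of the estimate coincides with the value predicted by the right-hand side of (21) up to rounding.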

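Corollary 1. Let $\mathbf{v}_1, \ldots, \mathbf{v}_p$ be determined by Lemma 1. Then the vector $\hat{f}$ and operators $\hat{\mathcal{F}}_1, \ldots, \hat{\mathcal{F}}_p$ satisfying the unconstrained problem (6) are determined by

$$\hat{f} = E[\mathbf{x}] - \sum_{k=1}^{p} \hat{F}_k E[\mathbf{v}_k] \quad\text{and}\quad \hat{\mathcal{F}}_1 = \mathcal{E}_{xv_1}, \ \ldots, \ \hat{\mathcal{F}}_p = \mathcal{E}_{xv_p} \qquad (36)$$

with $\hat{\mathcal{F}}_k$ such that $[\hat{\mathcal{F}}_k(\mathbf{v}_k)](\omega) = \hat{F}_k \mathbf{v}_k(\omega)$ where $\hat{F}_k \in \mathbb{R}^{m \times n}$ and $k = 1, \ldots, p$.

The accuracy associated with transform $\hat{\mathcal{T}}_p$ given by

$$\hat{\mathcal{T}}_p(\mathbf{y}) = \hat{f} + \sum_{k=1}^{p} \hat{\mathcal{F}}_k(\mathbf{v}_k) \qquad (37)$$

is such that

$$E[\|\mathbf{x} - \hat{\mathcal{T}}_p(\mathbf{y})\|^2] = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \|\mathbb{E}_{xv_k}\|^2. \qquad (38)$$

Proof. The proof follows directly from (32). ■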

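5.2.2 The case when matrix $\mathbb{E}_{v_k v_k}$ is not invertible for $k = 1, \ldots, p$

We write $A_k \in \mathbb{R}^{m \times n}$ for an arbitrary matrix, and define operators $\mathcal{A}_k : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^m)$ and $\mathcal{E}_{v_k v_k}, \mathcal{E}_{v_k v_k}^\dagger, (\mathcal{E}_{v_k v_k}^{1/2})^\dagger : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^n)$ similarly to those in (9) and (10).

For the case under consideration (matrix $\mathbb{E}_{v_k v_k}$ is not invertible), we introduce the SVD of $\mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger$,

$$U_k \Sigma_k V_k^T = \mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger, \qquad (39)$$

where, as above, $U_k \in \mathbb{R}^{m \times n}$, $V_k \in \mathbb{R}^{n \times n}$ are orthogonal and $\Sigma_k \in \mathbb{R}^{n \times n}$ is diagonal,

$$U_k = [s_{k1}, \ldots, s_{kn}], \quad V_k = [d_{k1}, \ldots, d_{kn}] \quad\text{and}\quad \Sigma_k = \operatorname{diag}(\beta_{k1}, \ldots, \beta_{kn}) \qquad (40)$$

with $\beta_{k1} \ge \cdots \ge \beta_{kr} > 0$, $\beta_{k,r+1} = \cdots = \beta_{kn} = 0$, $r = 1, \ldots, n$ and $r = r(k)$.

Let us set

$$U_{k\eta_k} = [s_{k1}, \ldots, s_{k\eta_k}], \quad V_{k\eta_k} = [d_{k1}, \ldots, d_{k\eta_k}] \quad\text{and}\quad \Sigma_{k\eta_k} = \operatorname{diag}(\beta_{k1}, \ldots, \beta_{k\eta_k}), \qquad (41)$$

where $U_{k\eta_k} \in \mathbb{R}^{m \times \eta_k}$, $V_{k\eta_k} \in \mathbb{R}^{n \times \eta_k}$ and $\Sigma_{k\eta_k} \in \mathbb{R}^{\eta_k \times \eta_k}$. Now we define $G_{k\eta_k} \in \mathbb{R}^{m \times n}$ and $\mathcal{G}_{k\eta_k} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^m)$ by

$$G_{k\eta_k} = U_{k\eta_k} \Sigma_{k\eta_k} V_{k\eta_k}^T \quad\text{and}\quad [\mathcal{G}_{k\eta_k}(\mathbf{w}_k)](\omega) = G_{k\eta_k}[\mathbf{w}_k(\omega)], \qquad (42)$$

respectively, for any $\mathbf{w}_k \in L^2(\Omega, \mathbb{R}^n)$.

As noted before, we write $\mathcal{I}$ for the identity operator.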

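Theorem 2. Let $\mathbf{v}_1, \ldots, \mathbf{v}_p$ be determined by Lemma 3. Then $f^0$ and $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$, satisfying (6)-(7), are determined by

$$f^0 = E[\mathbf{x}] - \sum_{k=1}^{p} F_k^0 E[\mathbf{v}_k] \qquad (43)$$

and

$$\mathcal{F}_1^0 = \mathcal{G}_{1\eta_1}(\mathcal{E}_{v_1 v_1}^{1/2})^\dagger + \mathcal{A}_1[\mathcal{I} - \mathcal{E}_{v_1 v_1}^{1/2}(\mathcal{E}_{v_1 v_1}^{1/2})^\dagger], \qquad (44)$$

$$\vdots$$

$$\mathcal{F}_p^0 = \mathcal{G}_{p\eta_p}(\mathcal{E}_{v_p v_p}^{1/2})^\dagger + \mathcal{A}_p[\mathcal{I} - \mathcal{E}_{v_p v_p}^{1/2}(\mathcal{E}_{v_p v_p}^{1/2})^\dagger], \qquad (45)$$

where for $k = 1, \ldots, p$, $\mathcal{A}_k$ is any linear operator such that $\operatorname{rank} \mathcal{F}_k^0 \le \eta_k$.¹

The accuracy associated with transform $\mathcal{T}_p^0$ given by (8) and (43)-(45) is such that

$$E[\|\mathbf{x} - \mathcal{T}_p^0(\mathbf{y})\|^2] = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \sum_{j=1}^{\eta_k} \beta_{kj}^2. \qquad (46)$$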

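Proof. For $\mathbf{v}_1, \ldots, \mathbf{v}_p$ determined by Lemma 3, $J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p)$ is represented by (22) as well. Let us consider $J_0$, $J_1$ and $J_2$ given by

$$J_0 = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \|\mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2, \qquad (47)$$

$$J_1 = \Big\|f - E[\mathbf{x}] + \sum_{k=1}^{p} F_kE[\mathbf{v}_k]\Big\|^2 \quad\text{and}\quad J_2 = \sum_{k=1}^{p} \|F_k\mathbb{E}_{v_k v_k}^{1/2} - \mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2. \qquad (48)$$

To show that

$$J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) = J_0 + J_1 + J_2 \qquad (49)$$

with $J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p)$ defined by (22), we use the relationships (see [...])

$$\mathbb{E}_{xv_k}\mathbb{E}_{v_k v_k}^\dagger\mathbb{E}_{v_k v_k} = \mathbb{E}_{xv_k} \quad\text{and}\quad \mathbb{E}_{v_k v_k}^\dagger\mathbb{E}_{v_k v_k}^{1/2} = (\mathbb{E}_{v_k v_k}^{1/2})^\dagger. \qquad (50)$$

¹ In particular, $\mathcal{A}_k$ can be chosen as the zero operator.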
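Then

$$\begin{aligned} J_1 = \operatorname{tr}\Big( & ff^T - fE[\mathbf{x}^T] + \sum_{k=1}^{p} fE[\mathbf{v}_k^T]F_k^T + E[\mathbf{x}]E[\mathbf{x}^T] - E[\mathbf{x}]f^T - \sum_{k=1}^{p} E[\mathbf{x}]E[\mathbf{v}_k^T]F_k^T \\ & + \sum_{k=1}^{p} F_kE[\mathbf{v}_k]f^T - \sum_{k=1}^{p} F_kE[\mathbf{v}_k]E[\mathbf{x}^T] + \sum_{k=1}^{p} F_kE[\mathbf{v}_k] \sum_{i=1}^{p} E[\mathbf{v}_i^T]F_i^T \Big) \end{aligned} \qquad (51)$$

and

$$\begin{aligned} J_2 &= \sum_{k=1}^{p} \operatorname{tr}\,(F_k - \mathbb{E}_{xv_k}\mathbb{E}_{v_k v_k}^\dagger)\,\mathbb{E}_{v_k v_k}\,(F_k^T - \mathbb{E}_{v_k v_k}^\dagger\mathbb{E}_{v_k x}) \\ &= \sum_{k=1}^{p} \operatorname{tr}\,(F_k\mathbb{E}_{v_k v_k}F_k^T - F_k\mathbb{E}_{v_k x} - \mathbb{E}_{xv_k}F_k^T + \mathbb{E}_{xv_k}\mathbb{E}_{v_k v_k}^\dagger\mathbb{E}_{v_k x}), \end{aligned} \qquad (52)$$

where

$$\sum_{k=1}^{p} \operatorname{tr}(F_k\mathbb{E}_{v_k v_k}F_k^T) = \operatorname{tr}\Big( E\Big[\sum_{k=1}^{p} F_k\mathbf{v}_k \sum_{i=1}^{p} \mathbf{v}_i^T F_i^T\Big] - \sum_{k=1}^{p} F_kE[\mathbf{v}_k] \sum_{i=1}^{p} E[\mathbf{v}_i^T]F_i^T \Big) \qquad (53)$$

because

$$E[\mathbf{v}_i\mathbf{v}_k^T] - E[\mathbf{v}_i]E[\mathbf{v}_k^T] = \mathbb{O} \quad\text{for } i \neq k \qquad (54)$$

due to orthogonality of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p$. On the basis of (50)-(53) and similarly to (30)-(31), we establish that (49) is true. Hence,

$$\begin{aligned} J(f, \mathcal{F}_1, \ldots, \mathcal{F}_p) = {} & \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \|\mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2 + \Big\|f - E[\mathbf{x}] + \sum_{k=1}^{p} F_kE[\mathbf{v}_k]\Big\|^2 \\ & + \sum_{k=1}^{p} \|F_k\mathbb{E}_{v_k v_k}^{1/2} - \mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2. \end{aligned} \qquad (55)$$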

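It follows from the last two terms in (55) that the constrained minimum (6)-(7) is achieved if $f = f^0$ with $f^0$ given by (43), and $F_k^0$ is such that

$$J_k(F_k^0) = \min_{F_k} J_k(F_k) \quad\text{subject to}\quad \operatorname{rank}(F_k) = \eta_k, \qquad (56)$$

where $J_k(F_k) = \|F_k\mathbb{E}_{v_k v_k}^{1/2} - \mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2$. The constrained minimum (6)-(7) is achieved if $f = f^0$ is defined by (43), and if [...]

$$F_k\mathbb{E}_{v_k v_k}^{1/2} = G_{k\eta_k}. \qquad (57)$$

The matrix equation (57) has the general solution [...]

$$F_k = F_k^0 = G_{k\eta_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger + A_k[I - \mathbb{E}_{v_k v_k}^{1/2}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger] \qquad (58)$$

if and only if

$$G_{k\eta_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\mathbb{E}_{v_k v_k}^{1/2} = G_{k\eta_k}. \qquad (59)$$

The latter is satisfied on the basis of the following derivation.²

² Note that the matrix $I - \mathbb{E}_{v_k v_k}^{1/2}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger$ is simply a projection onto the null space of $\mathbb{E}_{v_k v_k}$ and can be replaced by $I - \mathbb{E}_{v_k v_k}(\mathbb{E}_{v_k v_k})^\dagger$.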
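As an extension of the technique presented in proving Lemmata 1 and 2 in [...], it can be shown that for any matrices $Q_1, Q_2 \in \mathbb{R}^{m \times n}$,

$$\mathcal{N}(Q_1) \subseteq \mathcal{N}(Q_2) \ \Rightarrow\ Q_2(I - Q_1^\dagger Q_1) = \mathbb{O}, \qquad (60)$$

where $\mathcal{N}(Q_i)$ is the null space of $Q_i$ for $i = 1, 2$. In regard of the equation under consideration,

$$\mathcal{N}([\mathbb{E}_{v_k v_k}^{1/2}]^\dagger) \subseteq \mathcal{N}(\mathbb{E}_{xv_k}[\mathbb{E}_{v_k v_k}^{1/2}]^\dagger). \qquad (61)$$

The definition of $G_{k\eta_k}$ implies that

$$\mathcal{N}(\mathbb{E}_{xv_k}[\mathbb{E}_{v_k v_k}^{1/2}]^\dagger) \subseteq \mathcal{N}(G_{k\eta_k}) \quad\text{and then}\quad \mathcal{N}([\mathbb{E}_{v_k v_k}^{1/2}]^\dagger) \subseteq \mathcal{N}(G_{k\eta_k}).$$

On the basis of (60), the latter implies $G_{k\eta_k}[I - (\mathbb{E}_{v_k v_k}^{1/2})^\dagger\mathbb{E}_{v_k v_k}^{1/2}] = \mathbb{O}$, i.e. (59) is true. Hence, (58) and (44)-(45) are true as well.

Next, similar to (35),

$$\|\mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2 - \|G_{k\eta_k} - \mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2 = \sum_{j=1}^{\eta_k} \beta_{kj}^2. \qquad (62)$$

Then (46) follows from (43), (55), (57) and (62). ■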

Remark 4. The known reduced-rank transforms based on the Volterra polynomial structure [16, 27, 29] require the computation of a covariance matrix similar to $\mathbb{E}_{vv}$, where $\mathbf{v} = [\mathbf{v}_1, \ldots, \mathbf{v}_p]^T$, but for $p = N$ where $N$ is large (see Sections 1 and 2). The relationships (30)-(33) and (51)-(56) illustrate the nature of the proposed method and its difference from the techniques in [16, 27, 29]: due to the structure (3) of the transform $\mathcal{T}_p$, the procedure for finding $f^0, \mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ avoids direct computation of $\mathbb{E}_{vv}$ which could be troublesome due to large $N$. If operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ are orthonormal, as in Theorem 1, then (29) is true and the covariance matrix $\mathbb{E}_{vv}$ is reduced to the identity. If operators $\mathcal{Q}_1, \ldots, \mathcal{Q}_p$ are orthogonal, as in Theorem 2, then (54) holds and the covariance matrix $\mathbb{E}_{vv}$ is reduced to a block-diagonal form with non-zero blocks $\mathbb{E}_{v_1 v_1}, \ldots, \mathbb{E}_{v_p v_p}$ so that

$$\mathbb{E}_{vv} = \begin{bmatrix} \mathbb{E}_{v_1 v_1} & \mathbb{O} & \cdots & \mathbb{O} \\ \mathbb{O} & \mathbb{E}_{v_2 v_2} & \cdots & \mathbb{O} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbb{O} & \mathbb{O} & \cdots & \mathbb{E}_{v_p v_p} \end{bmatrix}$$

with $\mathbb{O}$ denoting the zero block. As a result, the procedure for finding $f^0, \mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ is reduced to $p$ separate rank-constrained problems (33) or (56). Unlike the methods in [16, 27, 29], the operators $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ are determined with much smaller $m \times n$ and $n \times n$ matrices given by the simple formulae (20) and (43)-(45). This implies a reduction in computational work compared with that required by the approach in [...].

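Corollary 2. Let $\mathbf{v}_1, \ldots, \mathbf{v}_p$ be determined by Lemma 3. Then the vector $\tilde{f}$ and operators $\tilde{\mathcal{F}}_1, \ldots, \tilde{\mathcal{F}}_p$, satisfying the unconstrained minimum (6), are determined by

$$\tilde{f} = E[\mathbf{x}] - \sum_{k=1}^{p} \tilde{F}_k E[\mathbf{v}_k] \qquad (63)$$

and

$$\tilde{\mathcal{F}}_1 = \mathcal{E}_{xv_1}\mathcal{E}_{v_1 v_1}^\dagger + \mathcal{A}_1[\mathcal{I} - \mathcal{E}_{v_1 v_1}\mathcal{E}_{v_1 v_1}^\dagger], \qquad (64)$$

$$\vdots$$

$$\tilde{\mathcal{F}}_p = \mathcal{E}_{xv_p}\mathcal{E}_{v_p v_p}^\dagger + \mathcal{A}_p[\mathcal{I} - \mathcal{E}_{v_p v_p}\mathcal{E}_{v_p v_p}^\dagger]. \qquad (65)$$

The associated accuracy for transform $\tilde{\mathcal{T}}_p$, defined by

$$\tilde{\mathcal{T}}_p(\mathbf{y}) = \tilde{f} + \sum_{k=1}^{p} \tilde{\mathcal{F}}_k(\mathbf{v}_k),$$

is given by

$$E[\|\mathbf{x} - \tilde{\mathcal{T}}_p(\mathbf{y})\|^2] = \|\mathbb{E}_{xx}^{1/2}\|^2 - \sum_{k=1}^{p} \|\mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger\|^2. \qquad (66)$$

Proof. It follows from (55) that the unconstrained minimum (6) is achieved if $f$ is defined by (63) and if $F_k$ satisfies the equation $F_k\mathbb{E}_{v_k v_k}^{1/2} - \mathbb{E}_{xv_k}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger = \mathbb{O}$ for each $k = 1, \ldots, p$. Similar to (57)-(58), its general solution is given by

$$F_k = \tilde{F}_k = \mathbb{E}_{xv_k}\mathbb{E}_{v_k v_k}^\dagger + A_k[I - \mathbb{E}_{v_k v_k}\mathbb{E}_{v_k v_k}^\dagger]$$

because $\mathbb{E}_{v_k v_k}^{1/2}(\mathbb{E}_{v_k v_k}^{1/2})^\dagger = \mathbb{E}_{v_k v_k}\mathbb{E}_{v_k v_k}^\dagger$. We define $\tilde{\mathcal{F}}_k$ by $[\tilde{\mathcal{F}}_k(\mathbf{w}_k)](\omega) = \tilde{F}_k[\mathbf{w}_k(\omega)]$ for all $k = 1, \ldots, p$, and then (64)-(65) are true. The relation (66) follows from (55) and (63)-(65). ■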
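The unconstrained solution and its error can be checked numerically in the simplest setting $p = 1$, $A_1 = \mathbb{O}$, with a singular sample covariance of $\mathbf{v}_1$. The data, variable names and the cutoff `rcond=1e-10` below are assumptions made for this sketch.

```python
import numpy as np

def cov(a, b):
    """Sample covariance E[(a - E a)(b - E b)^T]; rows are realizations."""
    ac, bc = a - a.mean(axis=0), b - b.mean(axis=0)
    return ac.T @ bc / a.shape[0]

rng = np.random.default_rng(6)
N = 1500
a = rng.normal(size=(N, 2))
v = np.column_stack([a, a[:, 0] - a[:, 1]])            # E_vv is singular (rank 2)
x = np.column_stack([np.sin(a[:, 0]), a[:, 1] ** 2]) + 0.1 * rng.normal(size=(N, 2))

E_xv, E_vv = cov(x, v), cov(v, v)
E_vv_pinv = np.linalg.pinv(E_vv, rcond=1e-10)
F = E_xv @ E_vv_pinv                                   # (64) with A_1 = O
x_hat = x.mean(axis=0) + (v - v.mean(axis=0)) @ F.T    # f taken from (63)

empirical = np.mean(np.sum((x - x_hat) ** 2, axis=1))
# ||E_xv (E_vv^{1/2})^+||^2 equals tr(E_xv E_vv^+ E_vx) for symmetric E_vv.
predicted = np.trace(cov(x, x)) - np.trace(E_xv @ E_vv_pinv @ E_xv.T)
```

The empirical error agrees with the right-hand side of (66) for $p = 1$, even though the covariance matrix of the observed vector cannot be inverted.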

Remark 5. The difference between the transforms given by Theorems 1 and 2 is that $\mathcal{F}_k^0$ by (20) (Theorem 1) does not contain a factor associated with $(\mathbb{E}_{v_k v_k}^{1/2})^\dagger$ for all $k = 1, \ldots, p$. A similar observation is true for Corollaries 1 and 2.

Remark 6. The transforms given by Theorem 2 and Corollary 2 are not unique due to the arbitrary operators $\mathcal{A}_1, \ldots, \mathcal{A}_p$. A natural particular choice is $\mathcal{A}_1 = \cdots = \mathcal{A}_p = \mathbb{O}$.

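5.2.3 Compression procedure by $\mathcal{T}_p^0$

Let us consider transform $\mathcal{T}_p^0$ given by (8), (43)-(45) with $A_k = \mathbb{O}$ for $k = 1, \ldots, p$ where $A_k$ is the matrix given in (58). We write $[\mathcal{T}_p^0(\mathbf{y})](\omega) = T_p^0(y)$ with $T_p^0 : \mathbb{R}^n \to \mathbb{R}^m$.

Let, in the notation of (41),

$$B_k^{(1)} = U_{k\eta_k}\Sigma_{k\eta_k} \quad\text{and}\quad B_k^{(2)} = V_{k\eta_k}^T(\mathbb{E}_{v_k v_k}^{1/2})^\dagger$$

so that $B_k^{(1)} \in \mathbb{R}^{m \times \eta_k}$ and $B_k^{(2)} \in \mathbb{R}^{\eta_k \times n}$. Here, $\eta_1, \ldots, \eta_p$ are determined by (7). Then

$$T_p^0(y) = f^0 + \sum_{k=1}^{p} B_k^{(1)} B_k^{(2)} v_k,$$

where $v_k = \mathbf{v}_k(\omega)$ and $B_k^{(2)} v_k \in \mathbb{R}^{\eta_k}$ for $k = 1, \ldots, p$ with $\eta_1 + \cdots + \eta_p < m$. Hence, matrices $B_1^{(2)}, \ldots, B_p^{(2)}$ perform compression of the data presented by $v_1, \ldots, v_p$. Matrices $B_1^{(1)}, \ldots, B_p^{(1)}$ perform reconstruction of the reference signal from the compressed data.

The compression ratio of transform $\mathcal{T}_p^0$ is given by

$$r^0 = (\eta_1 + \cdots + \eta_p)/m. \qquad (67)$$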
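The split of each term into a compression matrix and a reconstruction matrix can be sketched for a single term as follows. This is an illustration under stated assumptions: the covariance matrices are synthetic stand-ins, the function name `compression_pair` is hypothetical, and the factorization shown (left singular vectors scaled by singular values for reconstruction, right singular vectors composed with the pseudo-inverse square root for compression) is one natural reading of the procedure with $A_k = \mathbb{O}$.

```python
import numpy as np

def compression_pair(E_xv, E_vv, eta, rcond=1e-10):
    """Return (B1, B2) such that B1 @ B2 is the matrix of (58) with A_k = O."""
    Uv, sv, _ = np.linalg.svd(E_vv)
    keep = sv > rcond * sv.max()
    inv_sqrt = np.zeros_like(sv)
    inv_sqrt[keep] = sv[keep] ** -0.5
    R_pinv = Uv @ np.diag(inv_sqrt) @ Uv.T           # (E_vv^{1/2})^+
    U, s, Vt = np.linalg.svd(E_xv @ R_pinv, full_matrices=False)
    B1 = U[:, :eta] * s[:eta]        # m x eta: reconstruction
    B2 = Vt[:eta, :] @ R_pinv        # eta x n: compression
    return B1, B2

rng = np.random.default_rng(7)
m, n, eta = 4, 3, 1
A = rng.normal(size=(n, 2))
E_vv = A @ A.T                        # hypothetical singular covariance of v_k
E_xv = rng.normal(size=(m, 2)) @ A.T  # keeps E_xv consistent with E_vv (Lemma 2)
B1, B2 = compression_pair(E_xv, E_vv, eta)
ratio = eta / m                       # (67) with p = 1
```

A realization `v` is compressed to `B2 @ v`, which has `eta` components, and reconstructed as `B1 @ (B2 @ v)`; with no truncation the product `B1 @ B2` reduces to the unconstrained matrix of Corollary 2.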
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5.2.4 A special case of transform T p 

The results above have been derived for any operators <pi, . . . ,<p p in the model T p . Some 
specializations for (fi, . . . ,(p p were given in Section 1.3.21 Here and in Section 1,5. 2. ,51 we consider 
alternative forms for (p 1 , . . . , <p p . 

(i) Operators (p±, . . . ,(p p can be determined by a recursive procedure given below. The 
motivation follows from the observation that performance of the transform T p is improved if y 
in ||SJ) is replaced by an estimate of x. 

First, we set <^>k{y) = y and determine estimate x^ l > of x from the solution of problem © 
(with no constraints Q) by Corollaries Q] or |2| with p = 1. Next, we put tp\(y) = y and 
y? 2 (y) = #' , and find estimate x^ 2 ' from the solution of unconstrained problem (jHJ) with 
p = 2. In general, for j = 1, . . . ,p, we define <fij{y) = sc^' -1 ', where x^' -1 ' has been determined 
similarly to a;( 2 ) from the previous steps. In particular, a;( ) = y. 

(ii) Operators $\varphi_1, \ldots, \varphi_p$ can also be chosen as elementary functions. An example is given in
item (i) of Section 3.2, where $\varphi_k(y)$ was constructed from the power functions. An alternative
possibility is to choose trigonometric functions for constructing $\varphi_k(y)$. For instance, one can
put

$$[\varphi_1(y)](\omega) = y \quad \text{and} \quad [\varphi_{k+1}(y)](\omega) = [\cos(k y_1), \ldots, \cos(k y_n)]^T \tag{68}$$

with $y = [y_1, \ldots, y_n]^T$ and $k = 1, \ldots, p - 1$. In this paper, we do not analyse such a possible
choice for $\varphi_1, \ldots, \varphi_p$.
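The trigonometric choice (68) can be sketched in a few lines of Python. The function name `phi` and the list-based representation of a single realization $y(\omega)$ are assumptions made for illustration only; they do not appear in the paper.

```python
import math


def phi(k, y):
    """Evaluate phi_k on one realization y = [y_1, ..., y_n], following (68).

    phi_1(y) = y and phi_{k+1}(y) = [cos(k*y_1), ..., cos(k*y_n)].
    The name `phi` is hypothetical and used only for this sketch.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if k == 1:
        return list(y)
    # For k >= 2 the frequency is k - 1, because (68) defines phi_{k+1} via cos(k * y_i).
    return [math.cos((k - 1) * yi) for yi in y]


# Build u_k = phi_k(y) for k = 1, 2, 3 on a single realization.
y = [0.0, math.pi]
u = [phi(k, y) for k in range(1, 4)]
```

Each `u[k-1]` plays the role of one realization of $u_k = \varphi_k(y)$, to which the operators $Q_k$ would subsequently be applied.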

5.2.5 Other particular cases of transform $\mathcal{T}_p$ and comparison with known transforms

(i) Optimal non-linear filtering. The transforms $\mathcal{T}_p$ given by (36)-(37) and by (63)-(65), which are
particular cases of the transforms given in Theorems 1 and 2, represent optimal filters that
perform pure filtering with no signal compression. Therefore they are important in their own
right.

(ii) The Fourier series as a particular case of transform $\mathcal{T}_p$. For the case of the
minimization problem (??) with no constraint (??), $\mathcal{F}_1, \ldots, \mathcal{F}_p$ are determined by the expressions
(36) and (63)-(65), which are similar to those for the Fourier coefficients [35]. The structure of
the model $\mathcal{T}_p$ presented by (??) is different, of course, from that for the Fourier series and Fourier
polynomial (i.e. a truncated Fourier series) in Hilbert space [35]. The differences are that $\mathcal{T}_p$
transforms $y$ (not $x$ as the Fourier polynomial does) and that $\mathcal{T}_p$ consists of a combination of
three operators $\varphi_k$, $Q_k$ and $\mathcal{F}_k$, where $\mathcal{F}_k : L^2(\Omega, H_k) \to L^2(\Omega, H_x)$ is an operator, not a scalar
as in the Fourier series [35]. The solutions (36) and (63)-(65) of the unconstrained problem (??)
are given in terms of the observed vector $y$, not in terms of the basis of $x$ as in the Fourier
series/polynomial. The special features of $\mathcal{T}_p$ require special computation methods as described
in Section ??.

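Here, we show that the Fourier series is a particular case of the transform $\mathcal{T}_p$.

Let $x \in L^2(\Omega, H)$ with $H$ a Hilbert space, and let $\{v_1, v_2, \ldots\}$ be an orthonormal basis in
$L^2(\Omega, H)$. For any $g, h \in L^2(\Omega, H)$, we define the scalar product $\langle \cdot, \cdot \rangle$ and the norm $\| \cdot \|_E$ in
$L^2(\Omega, H)$ by

$$\langle g, h \rangle = \int_\Omega g(\omega) h(\omega) \, d\mu(\omega) \quad \text{and} \quad \|g\|_E = \langle g, g \rangle^{1/2}, \tag{69}$$

respectively. In particular, if $H = \mathbb{R}^m$ then

$$\|g\|_E^2 = \int_\Omega [g(\omega)]^T g(\omega) \, d\mu(\omega) = \int_\Omega \|g(\omega)\|^2 \, d\mu(\omega) = E[\|g\|^2], \tag{70}$$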

i.e. $E[\|g\|^2]$ is defined similarly to that in (??).

Let us consider the special case of transform $\mathcal{T}_p$ presented in item (iii) of Section 3.2 and
let us also consider the unconstrained problem (??) formulated in terms of such a $\mathcal{T}_p$, where we
now assume that $x$ has zero mean, $f = O$, $p = \infty$, $\{v_1, v_2, \ldots\}$ is an orthonormal basis in
$L^2(\Omega, H)$ and $\mathcal{F}_k$ is a scalar, not an operator as before. We denote $\alpha_k = \mathcal{F}_k$ with $\alpha_k \in \mathbb{R}$. Then,
similarly to (36) in Corollary 1, the solution to the unconstrained problem (??) is defined by $\hat{\alpha}_k$ such
that

$$\hat{\alpha}_k = E_{x v_k} \quad \text{with } k = 1, 2, \ldots$$

Here, $E_{x v_k} = E[x v_k] - E[x] E[v_k] = E[x v_k] = \langle x, v_k \rangle$ since $E[x] = 0$ by the assumption.
Hence, $\hat{\alpha}_k = E_{x v_k}$ is the Fourier coefficient and the considered particular case of $\mathcal{T}_p(y)$ with $\mathcal{F}_k$
determined by $\hat{\alpha}_k$ is given by

$$\mathcal{T}_p(y) = \sum_{k=1}^{\infty} \langle x, v_k \rangle v_k. \tag{71}$$

Thus, the Fourier series (71) in Hilbert space follows from (??), (??) and (36) when $\mathcal{T}_p$ has the
form given in item (iii) of Section 3.2 with $x$, $f$, $p$, $\{v_1, v_2, \ldots\}$ and $\mathcal{F}_k$ as above.

(iii) The Wiener filter as a particular case of transform $\mathcal{T}_p$ (63)-(65). In the following
Corollaries 3 and 4 we show that the filter $\mathcal{T}_p$ guarantees better accuracy than that of the Wiener
filter.

Corollary 3. Let $p = 1$, $E[x] = 0$, $E[y] = 0$, $\varphi_1 = I$, $Q_1 = I$ and $A_1 = O$ or $A_1 = E_{xy} E_{yy}^\dagger$.
Then $\mathcal{T}_p$ is reduced to the filter $\mathcal{T}$ such that

$$[\mathcal{T}(y)](\omega) = T[y(\omega)]$$

with

$$T = E_{xy} E_{yy}^\dagger. \tag{72}$$

Remark 7. The unconstrained linear filter, given by (72), has been proposed in [??]. The
filter (72) is treated as a generalisation of the Wiener filter.
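As a rough numerical illustration of (72), the matrix $E_{xy} E_{yy}^\dagger$ can be estimated from samples. The sketch below assumes NumPy, uses plain sample covariances in place of the exact ones, and the function name `wiener_filter_matrix` and the synthetic data are hypothetical; covariance estimation proper is outside the scope of the paper.

```python
import numpy as np


def wiener_filter_matrix(X, Y):
    """Sample-based estimate of E_xy E_yy^+ as in (72).

    X is m x q and Y is n x q; each column is one realization.
    The sample covariances used here are an assumption of this sketch.
    """
    q = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)  # remove the sample mean of x
    Yc = Y - Y.mean(axis=1, keepdims=True)  # remove the sample mean of y
    E_xy = Xc @ Yc.T / q
    E_yy = Yc @ Yc.T / q
    # The Moore-Penrose pseudo-inverse covers a singular E_yy as well.
    return E_xy @ np.linalg.pinv(E_yy)


# Synthetic check: if x = A y exactly, the estimated filter should recover A.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
Y = rng.standard_normal((4, 500))
X = A @ Y
T = wiener_filter_matrix(X, Y)
```

In this noiseless setting the estimate coincides with `A` up to rounding, because $E_{xy} = A E_{yy}$ and the sample $E_{yy}$ is nonsingular.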

Let $x, v_1, \ldots, v_p$ be the zero mean vectors. The transform $\mathcal{T}_p$, applied to $x, v_1, \ldots, v_p$, is
denoted by $\mathcal{T}_{W,p}$.

Corollary 4. The error $E[\|x - \mathcal{T}_{W,p}(y)\|^2]$ associated with the transform $\mathcal{T}_{W,p}$ is smaller than
the error $E[\|x - \mathcal{T}(y)\|^2]$ associated with the Wiener filter [??] by $\sum_{k=2}^{p} \|E_{x v_k} (E_{v_k v_k}^{1/2})^\dagger\|^2$, i.e.

$$E[\|x - \mathcal{T}_{W,p}(y)\|^2] = E[\|x - \mathcal{T}(y)\|^2] - \sum_{k=2}^{p} \|E_{x v_k} (E_{v_k v_k}^{1/2})^\dagger\|^2. \tag{73}$$

Proof. It is easy to show that

$$E[\|x - \mathcal{T}(y)\|^2] = \|E_{xx}^{1/2}\|^2 - \|E_{xy} (E_{yy}^{1/2})^\dagger\|^2, \tag{74}$$

and then (73) follows from (??) and (74). ■

(iv) The KLT as a particular case of transform $\mathcal{T}_p^0$ (43)-(47). The KLT [7] follows
from (43)-(47) as a particular case if $f = O$, $p = 1$, $\varphi_1 = I$, $Q_1 = I$ and $A_1 = O$.

To compare the transform $\mathcal{T}_p^0$ with the KLT [7], we apply $\mathcal{T}_p^0$, represented by (43)-(47), to
the zero mean vectors $x, v_1, \ldots, v_p$ as above. We write $\mathcal{T}_p^*$ for such a version of $\mathcal{T}_p^0$, and $\mathcal{T}_{KLT}$
for the KLT [7].

Corollary 5. The error $E[\|x - \mathcal{T}_p^*(y)\|^2]$ associated with the transform $\mathcal{T}_p^*$ is smaller than the
error $E[\|x - \mathcal{T}_{KLT}(y)\|^2]$ associated with the KLT [7] by $\sum_{k=2}^{p} \sum_{j=1}^{\eta_k} \beta_{kj}^2$, i.e.

$$E[\|x - \mathcal{T}_p^*(y)\|^2] = E[\|x - \mathcal{T}_{KLT}(y)\|^2] - \sum_{k=2}^{p} \sum_{j=1}^{\eta_k} \beta_{kj}^2. \tag{75}$$

Proof. The error associated with $\mathcal{T}_{KLT}$ [7] is represented by (??) for $p = 1$,

$$E[\|x - \mathcal{T}_{KLT}(y)\|^2] = \|E_{xx}^{1/2}\|^2 - \sum_{j=1}^{\eta_1} \beta_{1j}^2. \tag{76}$$

Then (75) follows from (??) and (76). ■

(v) The transform [26] as a particular case of transform $\mathcal{T}_p$. The transform [26]
follows from (??) as a particular case if $f = O$, $p = 2$, $\varphi_1(y) = y$, $\varphi_2(y) = y^2$ and $Q_1 = Q_2 = I$,
where $y^2$ is defined by $y^2(\omega) = [y_1^2, \ldots, y_n^2]^T$. We note that transform [26] has been generalized
in [??].

(vi) The transforms [27] as particular cases of transform $\mathcal{T}_p$. The transform [27]
follows from (??) if $Q_k = I$, $\varphi_k(y) = y^k$ where $y^k = (y, \ldots, y) \in L^2(\Omega, \mathbb{R}^{nk})$, $\mathbb{R}^{nk}$ is the $k$th
degree of $\mathbb{R}^n$, and if $\mathcal{F}_k$ is a $k$-linear operator.

To compare transform $\mathcal{T}_p^0$ and transform $\mathcal{T}_{[27]}$ [27] of rank $r$, we write $z_j = y_j y$, $z =
[z_1, \ldots, z_n]^T$, $s = [1 \; y^T \; z^T]^T$ and denote by $\alpha_1, \ldots, \alpha_r$ the non-zero singular values associated
with the truncated SVD for the matrix $E_{xs} (E_{ss}^{1/2})^\dagger$. Such an SVD is constructed similarly to that
in (??)-(??).

Corollary 6. Let $\Delta_p = \sum_{k=1}^{p} \sum_{j=1}^{\eta_k} \beta_{kj}^2 - \sum_{j=1}^{r} \alpha_j^2$ and let $\Delta_p > 0$. The error $E[\|x - \mathcal{T}_p^0(y)\|^2]$ associated
with the transform $\mathcal{T}_p^0$ is less than the error $E[\|x - \mathcal{T}_{[27]}(y)\|^2]$ associated with the transform $\mathcal{T}_{[27]}$
by $\Delta_p$, i.e.

$$E[\|x - \mathcal{T}_p^0(y)\|^2] = E[\|x - \mathcal{T}_{[27]}(y)\|^2] - \Delta_p. \tag{77}$$

Proof. It follows from [27] that

$$E[\|x - \mathcal{T}_{[27]}(y)\|^2] = \|E_{xx}^{1/2}\|^2 - \sum_{j=1}^{r} \alpha_j^2. \tag{78}$$

Then (77) follows from (??) and (78). ■

We note that, in general, a theoretical verification of the condition $\Delta_p > 0$ is not straightforward.
At the same time, for any particular $x$ and $y$, $\Delta_p$ can be estimated numerically.

Although the transform $\mathcal{T}_p^0$ includes the transform $\mathcal{T}_{[27]}$, the accuracy of $\mathcal{T}_{[27]}$ is, in general,
better than that of $\mathcal{T}_p^0$ for the same degrees of $\mathcal{T}_p^0$ and $\mathcal{T}_{[27]}$. This is because $\mathcal{T}_{[27]}$ implies more
terms. For instance, $\mathcal{T}_{[27]}$ of degree two consists of $n + 1$ terms while $\mathcal{T}_p^0$, for $p = 2$, consists
of three terms only. If, for a given $p$, the condition $\Delta_p > 0$ is not fulfilled, then the accuracy
$E[\|x - \mathcal{T}_p^0(y)\|^2]$ can be improved by increasing $p$ or by applying the iterative method presented
in [28].

(vii) Unlike the techniques presented in [13], [14], our method implements simultaneous
filtering and compression, and provides this data processing in a probabilistic setting. The idea of
implicitly mapping the data into a high-dimensional feature space [??], [12], [15] could be extended
to the transform presented in this paper. We intend to develop such an extension in the future.

6 Numerical realization

6.1. Orthogonalization. Numerical realization of transforms of random vectors implies a
representation of observed data and estimates of covariance matrices in the form of associated
samples.

For the random vector $u_k$, we have $q$ realizations, which are concatenated into an $n \times q$ matrix
$U_k$. A column of $U_k$ is a realization of $u_k$. Thus, a sequence of vectors $u_1, \ldots, u_p$ is
represented by a sequence of matrices $U_1, \ldots, U_p$. Therefore the transformation of $u_1, \ldots, u_p$ to
orthonormal or orthogonal vectors $v_1, \ldots, v_p$ (by Lemmata 1 and 3) is reduced to a procedure
for matrices $U_1, \ldots, U_p$ and $V_1, \ldots, V_p$. Here, $V_k \in \mathbb{R}^{n \times q}$ is a matrix formed from realizations of
the random vector $v_k$ for each $k = 1, \ldots, p$.

Alternatively, matrices $V_1, \ldots, V_p$ can be determined from known procedures for matrix
orthogonalization [33]. In particular, the QR decomposition [12] can be exploited in the following
way. Let us form a matrix $U = [U_1^T \ldots U_p^T]^T \in \mathbb{R}^{np \times q}$ where $p$ and $q$ are chosen such that $np = q$,
i.e. $U$ is square.$^3$ Let

$$U = VR$$

be the QR decomposition for $U$ with $V \in \mathbb{R}^{np \times q}$ orthogonal and $R \in \mathbb{R}^{np \times q}$ upper triangular.
Next, we write $V = [V_1^T \ldots V_p^T]^T \in \mathbb{R}^{np \times q}$ where $V_k \in \mathbb{R}^{n \times q}$ for $k = 1, \ldots, p$. The submatrices
$V_1, \ldots, V_p$ of $V$ are orthogonal, i.e.

$$V_i V_j^T = \begin{cases} O, & i \neq j, \\ I, & i = j, \end{cases}$$

for $i, j = 1, \ldots, p$, as required.

Other known procedures for matrix orthogonalization can be applied to $U_1, \ldots, U_p$ in a similar
fashion.
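The QR-based construction above can be sketched with NumPy as follows. The function name `qr_orthogonal_blocks` and the random test data are assumptions of this sketch; only `numpy.linalg.qr` is used for the decomposition.

```python
import numpy as np


def qr_orthogonal_blocks(U_blocks):
    """Stack U_1, ..., U_p (each n x q with n*p == q), factor U = V R,
    and return the n x q row blocks V_1, ..., V_p of V together with R.
    """
    n, q = U_blocks[0].shape
    p = len(U_blocks)
    if n * p != q:
        raise ValueError("this sketch needs n * p == q so that U is square")
    U = np.vstack(U_blocks)          # U is (n*p) x q, i.e. square
    V, R = np.linalg.qr(U)           # V orthogonal, R upper triangular
    V_blocks = [V[k * n:(k + 1) * n, :] for k in range(p)]
    return V_blocks, R


# Hypothetical sample matrices: p = 3 blocks of size 2 x 6.
rng = np.random.default_rng(1)
U_blocks = [rng.standard_normal((2, 6)) for _ in range(3)]
V_blocks, R = qr_orthogonal_blocks(U_blocks)
```

Since $V$ is square and orthogonal, $V V^T = I$, which is exactly the block relation $V_i V_j^T = O$ for $i \neq j$ and $V_i V_i^T = I$.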

Remark 8. For the cases when $v_1, \ldots, v_p$ are orthonormal or orthogonal but not orthonormal,
the associated accuracies (??), (??), (??) and (??) differ by the factors depending on $(E_{v_k v_k}^{1/2})^\dagger$.
In the case of orthonormal $v_1, \ldots, v_p$, $(E_{v_k v_k}^{1/2})^\dagger = I$ and this circumstance can lead to an increase
in the accuracy.

6.2. Covariance matrices. The expectations and covariance matrices in the lemmata and
theorems above can be estimated, for example, by the techniques developed in [30], [??], [??],
[39], [40], [??]. We note that such estimation procedures represent specific problems which are not
considered here.

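6.3. $\mathcal{T}_p^0$ and the unconstrained transforms $\mathcal{T}_p$ for zero mean vectors. The computational work for
$\mathcal{T}_p^0$ (Theorems 1 and 2) and for the two unconstrained transforms $\mathcal{T}_p$ (Corollaries 1 and 2) can be
reduced if they are applied to the zero mean vectors $\tilde{x}, \tilde{v}_1, \ldots, \tilde{v}_p$ given by
$\tilde{x} = x - E[x]$, $\tilde{v}_1 = v_1 - E[v_1]$, ..., $\tilde{v}_p = v_p - E[v_p]$. Then
$f^0 = O$ and $f = O$. The estimates of the original $x$ are then given by

$$E[x] + \sum_{k=1}^{p} \mathcal{F}_k^0(\tilde{v}_k) \quad \text{for } \mathcal{T}_p^0, \qquad E[x] + \sum_{k=1}^{p} \mathcal{F}_k(\tilde{v}_k) \quad \text{for each unconstrained } \mathcal{T}_p,$$

respectively. Here, $\mathcal{F}_k^0$ and the two versions of $\mathcal{F}_k$ are defined similarly to (??), (??), (??), (63), (64) and (65).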
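The effect of centering can be checked numerically. The sketch below assumes NumPy, arbitrary (not optimal) matrices standing in for the operators $\mathcal{F}_k$, and a free term of the form $f = E[x] - \sum_k \mathcal{F}_k E[v_k]$; the function names and data are hypothetical. It only illustrates that working with centered samples and adding $E[x]$ back gives the same estimate as carrying the free term explicitly.

```python
import numpy as np


def estimate_centered(mean_x, V_list, F_list):
    """Return E[x] + sum_k F_k (V_k - E[v_k]) column by column."""
    q = V_list[0].shape[1]
    est = np.tile(mean_x.reshape(-1, 1), (1, q))
    for F_k, V_k in zip(F_list, V_list):
        est = est + F_k @ (V_k - V_k.mean(axis=1, keepdims=True))
    return est


def estimate_with_free_term(mean_x, V_list, F_list):
    """Return f + sum_k F_k V_k with f = E[x] - sum_k F_k E[v_k] (assumed form)."""
    f = mean_x - sum(F_k @ V_k.mean(axis=1) for F_k, V_k in zip(F_list, V_list))
    return f.reshape(-1, 1) + sum(F_k @ V_k for F_k, V_k in zip(F_list, V_list))


# Hypothetical data: m = 2, n = 3, p = 2, q = 50 realizations.
rng = np.random.default_rng(2)
mean_x = rng.standard_normal(2)
V_list = [rng.standard_normal((3, 50)) + 1.5 for _ in range(2)]
F_list = [rng.standard_normal((2, 3)) for _ in range(2)]
est_a = estimate_centered(mean_x, V_list, F_list)
est_b = estimate_with_free_term(mean_x, V_list, F_list)
```

The two results agree up to rounding, so the free term never has to be formed when the data are centered first.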

7 Discussion 

Some distinctive features of the proposed techniques are summarized as follows. 

Remark 9. It follows from Theorems 1 and 2 and Corollaries 1 and 2 that the accuracy
associated with the proposed transform improves when $p$ increases.



$^3$ Matrix $U$ can also be presented as $U = [U_1 \ldots U_p]$ with $p$ and $q$ such that $n = pq$.

Remark 10. Unlike the approaches based on Volterra polynomials [27], [29], [??], our method does
not require computation of pseudo-inverses for large $N \times N$ matrices with $N = n + n^2 + \cdots + n^{p-1}$.
Instead, the proposed transforms use pseudo-inverses of the $n \times n$ matrices $E_{v_k v_k}$. See Theorems 1
and 2. This leads to a substantial reduction in computational work.
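To give a feel for the sizes involved in Remark 10, the short sketch below evaluates $N = n + n^2 + \cdots + n^{p-1}$ for sample values of $n$ and $p$. The function name `volterra_matrix_size` and the chosen values are hypothetical.

```python
def volterra_matrix_size(n, p):
    """N = n + n^2 + ... + n^(p-1), the matrix order quoted in Remark 10."""
    return sum(n ** i for i in range(1, p))


n, p = 10, 4
N = volterra_matrix_size(n, p)  # 10 + 100 + 1000 = 1110
```

For these values a single pseudo-inverse of order 1110 is replaced by pseudo-inverses of matrices of order 10.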



Remark 11. The idea of the recurrent transform [28] can be extended to the proposed transform
in a way similar to that considered in [28]. The authors intend to develop a theory for such
an extension in the foreseeable future.

8 Conclusions 

The new results obtained in the paper are summarized as follows. 

We have proposed a new approach to constructing optimal nonlinear transforms for random
vectors. The approach is based on a representation of a transform in the form of the sum of $p$
reduced-rank transforms. Each particular transform is formed by the linear reduced-rank operator
$\mathcal{F}_k$, and by operators $\varphi_k$ and $Q_k$ with $k = 1, \ldots, p$. Such a device allows us to improve the
numerical characteristics (accuracy, compression ratio and computational work) of the known
transforms based on the Volterra polynomial structure [27], [29], [??]. These objectives are achieved
due to the special "intermediate" operators $\varphi_1, \ldots, \varphi_p$ and $Q_1, \ldots, Q_p$. In particular, we have
proposed two types of orthogonalizing operators $Q_1, \ldots, Q_p$ (Lemmata 1 and 3) and a specific
method for determining $\varphi_1, \ldots, \varphi_p$ (Section 5.2.4). Such operators reduce the determination of
optimal linear reduced-rank operators $\mathcal{F}_1^0, \ldots, \mathcal{F}_p^0$ to the computation of a sequence of relatively
small matrices (Theorems 1 and 2).

Particular cases of the proposed transform, which follow from the solution of the unconstrained
minimization problem (??), have been presented in Corollaries 1 and 2. Such transforms
are treated as new optimal nonlinear filters and, therefore, are important in their own right.

The explicit representations of the accuracy associated with the proposed transforms have
been rigorously justified in Theorems 1 and 2 and Corollaries 1 and 2.

It has been shown that the proposed approach generalizes the Fourier series in Hilbert space
(Section 5.2.5), the Wiener filter, the Karhunen-Loève transform (KLT) and the known optimal
transforms [26], [27], [29]. See Corollaries 3, 4 and ?? and Section 5.2.5 in this regard. In particular,
it has been shown that the accuracies associated with the proposed transforms are better than
those of the Wiener filter (Corollary 4) and the KLT (Corollary 5).

A Appendix 

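Proof of Lemma 1. Let us write

$$w_1 = u_1 \quad \text{and} \quad w_i = u_i - \sum_{k=1}^{i-1} \mathcal{U}_{ik}(w_k) \quad \text{for } i = 1, \ldots, p,$$

with $\mathcal{U}_{ik} : L^2(\Omega, \mathbb{R}^n) \to L^2(\Omega, \mathbb{R}^n)$ chosen so that, for $k = 1, \ldots, i - 1$,

$$E_{w_i w_k} = O \quad \text{if } i \neq k. \tag{79}$$

We wish (79) to be true for any $k$, i.e.

$$\begin{aligned}
E_{w_i w_k} &= E[w_i w_k^T] - E[w_i] E[w_k^T] \\
&= E\Big[\Big(u_i - \sum_{l=1}^{i-1} \mathcal{U}_{il}(w_l)\Big) w_k^T\Big] - E\Big[u_i - \sum_{l=1}^{i-1} \mathcal{U}_{il}(w_l)\Big] E[w_k^T] \\
&= E[u_i w_k^T] - U_{ik} E[w_k w_k^T] - E[u_i] E[w_k^T] + U_{ik} E[w_k] E[w_k^T] \\
&= E_{u_i w_k} - U_{ik} E_{w_k w_k} = O.
\end{aligned}$$

Thus, $U_{ik} = E_{u_i w_k} E_{w_k w_k}^{-1}$, and the statement (i) is true.

It is clear that vectors $v_1, \ldots, v_p$, defined by (12), are orthogonal. For $Q_k$, defined by (13),
we have $Q_k = (E_{w_k w_k}^{1/2})^{-1}$ and

$$\begin{aligned}
E_{v_k v_k} &= E[(E_{w_k w_k}^{1/2})^{-1} w_k w_k^T (E_{w_k w_k}^{1/2})^{-1}] - E[(E_{w_k w_k}^{1/2})^{-1} w_k] E[w_k^T (E_{w_k w_k}^{1/2})^{-1}] \\
&= (E_{w_k w_k}^{1/2})^{-1} E_{w_k w_k} (E_{w_k w_k}^{1/2})^{-1} = I.
\end{aligned}$$

Hence, $v_1, \ldots, v_p$, defined by (12), are orthonormal.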
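The construction in this proof can be imitated on sample matrices. The sketch below assumes NumPy, replaces exact covariances with sample covariances, and uses a pseudo-inverse in place of the inverse so that a nearly singular $E_{w_k w_k}$ does not break the first step; the function names (`cov`, `orthogonalize`, `orthonormalize`) and the test data are hypothetical.

```python
import numpy as np


def cov(A, B):
    """Sample covariance E_ab; columns of A and B are realizations."""
    q = A.shape[1]
    Ac = A - A.mean(axis=1, keepdims=True)
    Bc = B - B.mean(axis=1, keepdims=True)
    return Ac @ Bc.T / q


def orthogonalize(U_list):
    """w_i = u_i - sum_{k<i} U_ik w_k with U_ik = E_{u_i w_k} E_{w_k w_k}^+."""
    W = []
    for U_i in U_list:
        W_i = U_i.copy()
        for W_k in W:
            U_ik = cov(U_i, W_k) @ np.linalg.pinv(cov(W_k, W_k))
            W_i = W_i - U_ik @ W_k
        W.append(W_i)
    return W


def orthonormalize(W):
    """v_k = (E_{w_k w_k}^{1/2})^{-1} w_k; needs nonsingular E_{w_k w_k}."""
    V = []
    for W_k in W:
        eigvals, eigvecs = np.linalg.eigh(cov(W_k, W_k))
        inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
        V.append(inv_sqrt @ W_k)
    return V


# Hypothetical correlated samples: u_1 = y, u_2 = y^2, u_3 = cos(y) elementwise.
rng = np.random.default_rng(3)
Y = rng.standard_normal((3, 400))
U_list = [Y, Y ** 2, np.cos(Y)]
W = orthogonalize(U_list)
V = orthonormalize(W)
```

With these samples the cross-covariances of distinct `V[i]`, `V[k]` vanish up to rounding, and each `cov(V[k], V[k])` is close to the identity, mirroring the two claims of the lemma.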

Proof of Lemma 3. We wish that $E_{v_i v_k} = O$ for $i \neq k$. If $\mathcal{Z}_{il}$ has been chosen so that this
condition is true for all $k = 1, \ldots, i - 1$ then, since $v_i = u_i - \sum_{l=1}^{i-1} \mathcal{Z}_{il}(v_l)$, we have

$$E_{v_i v_k} = E_{u_i v_k} - \sum_{l=1}^{i-1} Z_{il} E_{v_l v_k} = E_{u_i v_k} - Z_{ik} E_{v_k v_k} = O. \tag{80}$$

Thus,

$$Z_{ik} E_{v_k v_k} = E_{u_i v_k}. \tag{81}$$

The necessary and sufficient condition [13] for the solution of the matrix equation (81) is given
by

$$E_{u_i v_k} E_{v_k v_k}^\dagger E_{v_k v_k} = E_{u_i v_k}. \tag{82}$$

By Lemma 2, (82) is true. Then, on the basis of [13], the general solution to (81) is given by (16).